How can I download an article?

To download an article from SID, first log in to the site, search for the article title, and click on the 'Download Article' option.

How can I download an ISI article?

To download an ISI article on SID, enter the keyword or article title in the search bar, view the relevant results, click on the desired article, and select the 'Download Article' option.

How can I access the SID database?

To access the SID database, visit SID.ir, create an account, and log in to access scientific resources.

Is downloading articles from SID free?

Some articles on SID are available for free, while others require payment. Details are specified on the article's page.

Scientific Information Database (SID) - Trusted Source for Research and Academic Resources

Journal Article

Title:

Research on Advanced Streaming Processing on Apache Spark

Author(s):

Sasikanth K.V.K. | Samatha K. | Deshai N. | Sekhar B.V.D.S. | Venkatramana S.

Journal:

INTERNATIONAL JOURNAL OF INDUSTRIAL ENGINEERING & PRODUCTION RESEARCH

Issue Info:

Year:
2021
Volume:
32
Issue:
1
Pages:
133-141

Keywords:

Apache Spark

Abstract:

Computation in today's digital world is tremendously demanding: a wide variety of applications must process and store enormous datasets. The volume of digital data is vast, mostly unstructured, generated at very high velocity, and doubling day by day. Over the last decade, many organizations have struggled to handle and process massive chunks of data, which existing conventional technologies could not process efficiently for lack of enhancements. This paper addresses how to overcome these problems efficiently using Hadoop, a recent and powerful open-source data processing tool, and one of its core components, MapReduce, which nevertheless has a few performance issues. The main goal of this paper is to address and overcome the limitations and weaknesses of MapReduce with Apache Spark.

Download 12

Citation 0

Reference 0

Seminar Article

Title:

A Scalable Pattern Mining Method Using Apache Spark Platform

Writer:

Samiei Samaneh | Joodaki Mehdi | GHADIRI NASSER

Conference:

INTERNATIONAL CONFERENCE ON WEB RESEARCH

Issue Info:

Year:
2021
Volume:
7

Keywords:

Apache Spark

Big Data

Log Files Analysis

Sequential Pattern Mining

Abstract:

The amount of data on the Internet is growing sharply. Some data, such as log files, are enormous and contain valuable hidden patterns. A log file is a set of recorded events that carry useful information for improving web server performance, stabilizing and controlling server loads, and speeding up user response operations. However, analyzing massive data takes a long time and requires powerful hardware. Also, the performance of sequential pattern mining methods is usually unsatisfactory on such data. This paper proposes a novel parallel method, built on the Apache Spark platform, for finding log file patterns such as frequent patterns (e.g., URL, IP, status code), how users accessed files, the number of errors, and the most common errors. Experimental results demonstrate that the proposed method's run time on three datasets is significantly lower than that of four rival pattern mining methods.

Download 0

Seminar Article

Title:

Linked Data Partitioning for RDF Processing on Apache Spark

Writer:

Atashkar Amir Hossein | GHADIRI NASSER | Joodaki Mehdi

Conference:

INTERNATIONAL CONFERENCE ON WEB RESEARCH

Issue Info:

Year:
2017
Volume:
3

Keywords:

Abstract:

RDF models are widely used in the Web of Data due to their flexibility and similarity to graph patterns. With the growing use of RDF, its volume and content are increasing. Processing such amounts of data on a single machine is therefore not efficient enough, because of response time and limited hardware resources. As a result, cluster processing has been introduced to process this data model. One such cluster processing tool is Apache Hadoop, but because it relies heavily on hard disks, its response time is usually unacceptable. To address this problem, in this paper we use Apache Spark for rapid processing of RDF data models. One key feature of Apache Spark is that it uses main memory instead of hard disk, which improves data processing speed. We then run SQL queries on RDF data partitioned across the cluster.

Download 82

Journal Article

Title:

Performance Evaluation of Apache Spark MLlib Algorithms on an Intrusion Detection Dataset

Author(s):

Atefinia Ramin | Ahmadi Mahmood

Journal:

JOURNAL OF COMPUTING AND SECURITY

Issue Info:

Year:
2022
Volume:
9
Issue:
1
Pages:
57-69

Keywords:

Intrusion Detection Systems

Apache Spark

MLlib

Machine Learning

Abstract:

The increase in the use of the Internet and web services and the advent of the fifth generation of cellular network technology (5G), along with ever-growing Internet of Things (IoT) data traffic, will grow global internet usage. To ensure the security of future networks, machine learning-based intrusion detection and prevention systems (IDPS) must be implemented to detect new attacks, and big data parallel processing tools can be used to handle a huge collection of training data in these systems. In this paper, Apache Spark, a fast, general-purpose cluster computing platform, is used for processing and training on a large volume of network traffic feature data. The most important features of the CSE-CIC-IDS2018 dataset are used for constructing machine learning models, and then the most popular machine learning approaches, namely Logistic Regression, Support Vector Machine (SVM), three different Decision Tree classifiers, and the Naive Bayes algorithm, are used to train the model using up to eight worker nodes. Our Spark cluster contains seven machines acting as worker nodes and one machine configured as both a master and a worker. We use the CSE-CIC-IDS2018 dataset to evaluate the overall performance of these algorithms on Botnet attacks, and distributed hyperparameter tuning is used to find the best single decision tree parameters. We achieved up to 100% accuracy in our experiments using features selected by the learning method.

Download 2

Citation 0

Reference 0

Seminar Article

Title:

Real-Time Blood Pressure Prediction Using Apache Spark and Kafka Machine Learning

Writer:

Farki Ali | Akhondzadeh Noughabi Elham

Conference:

INTERNATIONAL CONFERENCE ON WEB RESEARCH

Issue Info:

Year:
2023
Volume:
9

Keywords:

Abstract:

Using a mix of machine learning algorithms and big data tools, particularly Apache Spark and Apache Kafka, this research provides a new method for real-time blood pressure prediction. The method can handle large amounts of inbound data from numerous sources, including wearable technology and Internet of Things monitors. A clustering-based approach is used to improve the precision of the blood pressure estimate while the data is analyzed in real time. A dataset of ECG, PPG, and ABP signals is used to assess the proposed method, and the findings show a substantial improvement in blood pressure prediction accuracy compared to previous methods. The proposed method has potential applications in remote patient tracking, individualized healthcare, and early detection of cardiovascular disease. This research offers two contributions. First, it introduces a novel technique for real-time blood pressure prediction that is more accurate than current approaches. Second, it shows the value of merging machine learning techniques with real-time streaming data processing systems like Apache Spark and Apache Kafka. The use of web-based tools and deep learning methods further improves the scalability and accuracy of the system. The proposed method may have a significant impact on patient outcomes and treatment costs. Overall, this research offers a path toward real-time blood pressure prediction tools that can be useful to both individuals and healthcare professionals.

Download 28

Journal Article

Title:

DNA barcoding using particle swarm optimization on apache spark SQL case study: DNA of covid-19

Author(s):

Riza Lala Septem | Nurfathiya Muhammad Ilham | Kusnendar Jajang | Abu Samah Khyrina Airin Fariza

Journal:

INTERNATIONAL JOURNAL OF NONLINEAR ANALYSIS AND APPLICATIONS

Issue Info:

Year:
2021
Volume:
12
Issue:
Special Issue
Pages:
1561-1572

Keywords:

Big Data

Algorithm

Particle swarm optimization

Similarity check

Motif discovery

DNA barcoding

Abstract:

The objective of this research is to design and implement a computational model to determine DNA barcodes using the Particle Swarm Optimization (PSO) algorithm implemented on big data platforms, namely Apache Hadoop and Apache Spark. The steps are as follows: (i) inputting DNA sequences to the Hadoop Distributed File System (HDFS) in Apache Hadoop, (ii) pre-processing data, (iii) implementing PSO through a User Defined Function (UDF) in Apache Spark, and (iv) collecting results and saving them to HDFS. After obtaining the computational model, the following two simulations were run: the first scenario uses 4 cores and several worker nodes, while the second consists of a cluster with 2 worker nodes and several cores. In terms of computational time, the results show a significant acceleration of the big data platforms over standalone execution in both experimental scenarios. This study shows that the computational model built on the big data platform extends the features of previous research and runs faster.

Download 4

Citation 0

Reference 0

Seminar Article

Title:

A Survey on Big Data Usage in the Internet of Things

Writer:

Padidaran Moghaddam Farhang | Sadeghi Bajgiran Mahshid | Aghaee Saeed

Conference:

INTERNATIONAL CONFERENCE ON MODERN TECHNOLOGY IN SCIENCES

Issue Info:

Year:
2019
Volume:
2

Keywords:

Abstract:

Today, there are many data sources due to the increased use of smart devices, which are responsible for connecting, collecting, exchanging, and transmitting this volume of data. In addition, research shows that by the year 2030 about a trillion sensors will connect to the Internet of Things, collecting and transmitting a large amount of data. Therefore, there is a need to use big data applications in the IoT. These technologies are interdependent and must be developed together. In this paper, we review some of the challenges and issues of big data, as well as research done by other researchers. We then examine and compare the two major big data frameworks, and finally examine the requirements and review the analytical solutions in this area.

Download 0

Journal Article

Title:

Optimizing the SPARK Algorithm for Producing Land Use Maps from Satellite Images

Author(s):

علی محمدی عباس | نوابی ایرما | شیرکوند م.

Journal:

نقشه برداری

Issue Info:

Year:
1383
Volume:
-
Issue:
1
Pages:
5-12

Keywords:

Abstract:

Download 0

Citation 1

Reference 0

Journal Article

Title:

Behavior-Based Online Anomaly Detection for a Nationwide Short Message Service

Author(s):

Shaeiri Z. | Kazemitabar J. | Bijani Sh. | TALEBI M.

Journal:

JOURNAL OF ARTIFICIAL INTELLIGENCE AND DATA MINING

Issue Info:

Year:
2019
Volume:
7
Issue:
2
Pages:
239-247

Keywords:

Short Message Service

Behavioral Profiling

Anomaly Detection

Apache Spark

Abstract:

As fraudsters understand the time windows and act fast, real-time fraud management systems become necessary in the telecommunication industry. In this work, by analyzing the traces collected from a nationwide cellular network over a period of a month, an online behavior-based anomaly detection system is provided. Over time, users' interactions with the network produce a vast amount of usage data. This usage data is modeled as profiles by which the users can be identified. A statistical model is proposed that assigns a risk number to each incoming record, revealing its deviation from the normal behavior stored in the profiles. Based on the amount of this deviation, a decision is made to flag the record as normal or abnormal. If the activity is normal, the associated profile is updated; otherwise, the record is flagged as abnormal and considered for further investigation. For handling the big dataset and implementing the methodology, we used the Apache Spark engine, which is an open-source, fast, general-purpose cluster computing system for big data handling and analysis. The experimental results show that the proposed approach can perfectly detect deviations from normal behavior and can be exploited for detecting anomaly patterns.

Download 122

Citation 0

Reference 0

Seminar Article

Title:

An Approach to Improve Apriori Algorithm for Extraction of Frequent Itemsets

Writer:

Shayegan Mohammad Javad | Asgari Namin Parsa

Conference:

INTERNATIONAL CONFERENCE ON WEB RESEARCH

Issue Info:

Year:
2021
Volume:
7

Keywords:

Apache Spark

Frequent Patterns

Association Rule Mining

Apriori Algorithm

Abstract:

The amount of data generated today is immense in terms of volume, velocity, and variety. This, in turn, has created a great challenge for scientists and researchers, who have suggested a variety of schemes to help alleviate the problem. One of the suggested schemes is Association Rule Mining, which is primarily focused on finding associations in transaction-like data. To find such associations, Frequent Itemsets must be discovered first. This research therefore presents a new approach to finding Frequent Itemsets, based on the Apriori algorithm and the Apache Spark distributed platform. Further, we introduce an extended version of Apriori that finds Maximal Frequent Itemsets first to help speed up the mining process. The results, compared with algorithms such as YAFIM and HFIM and the original Apriori, show that the suggested algorithm outperforms them on dense datasets by an average of 38 percent.

Download 0

Scientific Information Database

ISSN: 2588-4824
